Using the infection estimates from healthdata.org, we compute fractional change metrics as an estimate of the new infection rate net of the recovery and death rates. Under the assumption that the studied countermeasures do not affect recovery and death rates, we construct a Bayesian model to estimate the effectiveness of each restriction cataloged in the healthdata.org dataset at reducing the infection rate.
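On a toy series the fractional change metric is straightforward to illustrate. This sketch uses made-up active-infection counts, not the healthdata.org estimates, and mirrors the diff-then-divide computation performed later in this section:

```python
import pandas as pd

# Hypothetical active-infection counts for a single location (toy data).
active = pd.Series([100.0, 120.0, 150.0, 165.0])

# Day-over-day difference divided by the current estimate, matching
# the fractional_diff_est_infections_mean column computed below.
frac = active.diff() / active
```

When new infections outpace recoveries and deaths this quantity is positive, so a countermeasure that lowers the infection rate should push it toward zero or below.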
The data is stored in two CSV files. The primary file contains dated estimates for each included location of values such as total active infections, burden on ICUs, etc. The second contains aggregate statistics along with start and end dates for each of the countermeasures.
import numpy as np
import pandas as pd

healthdata = pd.read_csv('2020_05_28/Hospitalization_all_locs.csv', parse_dates=['date'])
stats = pd.read_csv('2020_05_28/Summary_stats_all_locs.csv', parse_dates=[
'peak_bed_day_mean', 'peak_bed_day_lower',
'peak_bed_day_upper', 'peak_icu_bed_day_mean', 'peak_icu_bed_day_lower',
'peak_icu_bed_day_upper', 'peak_vent_day_mean', 'peak_vent_day_lower',
'peak_vent_day_upper', 'travel_limit_start_date', 'travel_limit_end_date',
'stay_home_start_date', 'stay_home_end_date',
'educational_fac_start_date', 'educational_fac_end_date',
'any_gathering_restrict_start_date', 'any_gathering_restrict_end_date',
'any_business_start_date', 'any_business_end_date',
'all_non-ess_business_start_date', 'all_non-ess_business_end_date',
])
# We melt the start and end dates and drop the missing data so we can determine
# whether or not a policy was in place using an asof merge to the most recent change
countermeasures = stats.melt(
id_vars=['location_name'],
value_vars=[c for c in stats.columns if 'date' in c],
value_name='date'
).dropna()
countermeasures['type'] = countermeasures['variable'].apply(
lambda name: '_'.join(name.replace('-', '_').split('_')[:-2])
)
countermeasure_types = sorted(set(countermeasures['type']))
countermeasures['enforced'] = countermeasures['variable'].apply(lambda name: name.split('_')[-2] == 'start')
countermeasures.drop(columns=['variable'], inplace=True)
countermeasures.sort_values('date', inplace=True)
countermeasure_types
['all_non_ess_business', 'any_business', 'any_gathering_restrict', 'educational_fac', 'stay_home', 'travel_limit']
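The as-of merge described above can be sketched on a toy frame: each daily observation picks up the most recent policy change on or before its date, and days before the first recorded change come through as missing, i.e. never enforced. The location, dates, and values here are hypothetical:

```python
import pandas as pd

# Hypothetical policy changes: stay_home starts Mar 20, ends Apr 10.
changes = pd.DataFrame({
    'location_name': ['X', 'X'],
    'date': pd.to_datetime(['2020-03-20', '2020-04-10']),
    'stay_home': [True, False],
}).sort_values('date')

# Daily observations before, during, and after enforcement.
daily = pd.DataFrame({
    'location_name': ['X', 'X', 'X'],
    'date': pd.to_datetime(['2020-03-01', '2020-03-25', '2020-04-15']),
}).sort_values('date')

# Backward as-of merge: match the latest change at or before each date.
merged_demo = pd.merge_asof(daily, changes, on='date', by='location_name')
# Days before the first change have no match, so NaN means "never enforced".
merged_demo['stay_home'] = merged_demo['stay_home'].fillna(False)
```

This is the same pattern applied to merged below, one countermeasure type at a time.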
merged = healthdata.copy()
# healthdata uses projections for the past two weeks just as they do for future dates
# as a mitigation for lag in reporting. To look only at estimates for days with data,
# discard anything newer than two weeks before the latest data.
merged.dropna(subset=['confirmed_infections'], inplace=True)
merged = merged.loc[merged['date'] < merged['date'].max() - pd.to_timedelta('2w')]
# Regions that are decomposed are stored here, too, but without countermeasure data.
# Therefore we discard the groups and keep their subdivisions
group_names = ['Brazil', 'Canada', 'Germany', 'Italy', 'Mexico', 'Spain', 'United States of America']
merged = merged.loc[~merged['location_name'].isin(group_names)]
# To reduce noise, only investigate growth once a location has more than N infections
N = 200
merged = merged.loc[merged['est_infections_mean'] > N]
# Drop locations with fewer than min_samples remaining observations
min_samples = 40
counts = merged.groupby('location_name')['V1'].count()
keep = counts.index[counts >= min_samples]
merged = merged.loc[merged['location_name'].isin(keep)]
merged.sort_values('date', inplace=True)
for countermeasure_type, subset in countermeasures.groupby('type'):
    subset = subset.drop(columns='type').rename(columns={'enforced': countermeasure_type})
    merged = pd.merge_asof(merged, subset, on='date', by='location_name')
    merged[countermeasure_type] = merged[countermeasure_type].fillna(False)
cols = ['est_infections_mean', 'est_infections_lower', 'est_infections_upper']
diffs = merged.groupby('location_name')[cols].diff().rename(columns={c: f'diff_{c}' for c in cols})
merged = pd.concat([merged, diffs], axis=1)
merged['fractional_diff_est_infections_mean'] = merged['diff_est_infections_mean'] / merged['est_infections_mean']
merged['day_of_year'] = merged['date'].dt.dayofyear
merged['day_of_week'] = merged['date'].dt.dayofweek
merged['random'] = np.random.normal(size=len(merged))
The enforcement of each category of countermeasure is highlighted in red. Note the limited prevalence of such measures in the first weeks of data and the relative infrequency of limitations on travel.
[plot: per-location time series with countermeasure enforcement periods highlighted in red]
The following plot shows the frequency of each countermeasure combination, with enforced countermeasures toward the upper right. Note that there are no samples where any_business is False but all_non_ess_business is True, since the latter is simply a stricter level of the former. The same holds for any_gathering_restrict and stay_home.
[plot: frequency of each countermeasure combination]
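The nesting claim can also be checked numerically rather than visually: if all_non_ess_business implies any_business, a crosstab of the two flags should have a zero in the (False, True) cell. A sketch with hypothetical flag values, using the column names from the merged frame:

```python
import pandas as pd

# Hypothetical enforcement flags; the stricter measure is only ever True
# when the broader one is also True.
flags = pd.DataFrame({
    'any_business': [False, True, True, True],
    'all_non_ess_business': [False, False, True, True],
})

table = pd.crosstab(flags['any_business'], flags['all_non_ess_business'])
# table.loc[False, True] counts rows where the broader measure is off
# but the stricter one is on; under the nesting claim it is zero.
```

The same crosstab on the real merged frame would confirm the relationship for any_gathering_restrict and stay_home as well.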